ABSTRACT

Interpersonal problem solving skills allow engineers to prevent interpersonal difficulties more 
effectively and to manage conflict, both of which are critical to successful participation on teams. This 
research provides evidence that the CareerWISE online learning environment can improve those skills 
among women in engineering graduate programs. In a randomized controlled trial, N = 128 female 
doctoral students were randomly assigned to treatment and wait-list control (WLC) groups; treatment 
consisted of interacting for at least five hours with an online learning environment with comprehensive 
instruction in problem-solving applied to common interpersonal situations in the academy. A scenario-based
assessment instrument measured participants’ ability to describe how they would apply that
problem solving to a fictional scenario, and a rubric was used to score the responses. Results showed 
that treatment group members had better knowledge of interpersonal problem solving steps and were 
better able to describe how they would apply problem-solving skills to a relevant scenario. 

INTRODUCTION

Broadly defined, problem solving is a process that both identifies multiple alternatives for 
addressing a problematic situation and facilitates the selection of the option most likely to be 
effective from among the alternatives (D’Zurilla and Goldfried 1971). Across disciplinary perspectives
and application areas, it is considered a higher order cognitive skill, requiring more than factual 
recall or the application of a single component of knowledge or skill. For example, the two highest
levels of Bloom’s (1956) well-known taxonomy of cognitive skills deal with the ability to apply
knowledge and skills to new domains and are often grouped together under the label of “problem 
solving” (Crooks 1988). In the field of cognitive informatics, which combines ideas from cognitive 
science and informatics to investigate the human mind and its relation to associated engineering 
applications, problem solving is considered to be at the highest cognitive layer of the brain (Wang 
and Chiew 2010). 

In engineering, problem solving is widely regarded as a skill necessary for adequate career preparation 
(e.g., Clough 2004; National Research Council 2007). The majority of related work within the engineering
education community focuses on the application of discipline-specific skills to address technical
problems (for recent examples, see Liberatore, Vestal, and Herring 2012; Moreno, Ozogul, and Reisslein
2011; Steif, Lobue, Kara, and Fay 2010). However, the development of effective interpersonal problem
solving skills is also of great importance (Bancino and Zevalkink 2007; Shuman, Besterfield-Sacre, and
McGourty 2005). Such skills allow more effective prevention and resolution of interpersonal difficulties.
Among graduate students in particular, they also allow students to manage natural conflicts that
can arise, for example, through participation in research teams or interactions with academic advisors 
(Bernstein 2011). 

Despite the importance of problem-solving strategies for addressing interpersonal challenges, 
effective and easily accessible strategies are not available for teaching and learning them. Recent work 
also indicates that faculty are uncertain about how to best teach these and other professional skills 
(Matusovich et al. 2012). One way to increase accessibility is to make content available in an online 
format, with which students are increasingly comfortable; in fact, the number of students taking at 
least one online course in the United States has now approached 6.7 million (Allen and Seaman 2013).
The online format makes it possible to appeal to different learning styles with diverse formats through 
which instructional content can be presented and through which learners can interact (Narciss, Proske, 
and Koerndle 2007). 

While recent years have shown a rise in the development, study, and usage of online environments 
and simulations to teach technical skills (e.g., Physics Education Technology Project; Perkins et al. 
2006; Wieman, Adams, and Perkins 2008), research into the use of online tools for teaching and 
learning non-technical or non-procedural skills has been limited. While not developed for engineering
students in particular, examples of efforts to teach professional skills in an online format have included 
the application of web-based immersive technology in preparing salespeople (Kass, Burke, Blevis, 
and Williamson 1994), advancing medical education, and training members of the military services in 
interpersonal leadership behaviors and cross-cultural awareness and communication skills (Campbell 
et al. 2011; Johnson 2010; Zbylut et al. 2007; Zbylut and Ward 2004). The purpose of this paper is to 
empirically investigate the effectiveness of the CareerWISE online learning environment targeting 
women in doctoral programs in the physical sciences and engineering and designed to positively 
affect interpersonal problem solving skills. 

Women who begin science, technology, engineering, and math (STEM) doctoral programs 
have already demonstrated their academic abilities and commitment to STEM careers. However, 
Bernstein (2011) and Bernstein and Russo (2008) suggest that, even so, some women still decide 
to change course and leave their PhD programs due to discouragement they experience while enrolled
in their doctoral programs. While institutional change efforts through programs like the NSF’s
ADVANCE program, which funds efforts to address aspects of academic culture and institutional
structure that may differentially affect STEM women, are critical for improving the policies, practices,
and environment for promoting gender equity, the CareerWISE online learning environment
utilized in the study here takes a complementary approach, working to bolster personal resilience 
skills, including interpersonal problem-solving skills, and prepare women for the moments when 
they experience doubts. Notably, the focus on skill building amongst the women who utilize the 
resource in this study is not meant to imply that the women are deficient or that the difficulties 
they encounter within their academic environments are of their own making. In fact, a substantial 
portion of the training used in this study focuses on understanding the gendered and contextual 
aspects of a problematic situation. 

Results are presented from a randomized controlled trial (RCT) in which the intervention was 
interaction with the CareerWISE online resilience training program (https://careerwise.asu.edu). 
The online learning environment used in the study was developed by an interdisciplinary research 
program at a large public university in the Southwest that works to improve retention in STEM 
fields where women leave doctoral programs at a significantly higher rate than men (Ampaw and 
Jaeger 2012; Council of Graduate Schools 2008; Lott, Gardner, and Powers 2009). Study results 
are presented regarding the participants’ ability to describe both an appropriate process for 
interpersonal problem solving and how that process would be applied to a given scenario relevant 
to graduate students in engineering. The measurement instrument used in the study is discussed, 
along with the associated rubrics for evaluating responses and evidence to support the reliability 
and validity of the rubrics. 

METHODS


Intervention Description 

The CareerWISE online learning environment utilized in this study is grounded in psychological
theory and research on personal resilience and coping (Carver et al. 1989; Lazarus 1993; Lazarus and 
Folkman 1984; Maddi 2005; Masten 2001), self-efficacy and social-cognitive career theories (Bandura 
1986; Bandura 1997; Hackett and Betz 1981; Lent et al. 1994; Lent et al. 2000), and problem-solving 
processes and cognitive-behavioral interventions (Beck 1983; Pretzer and Beck 2007; D’Zurilla and 
Nezu 2008). The theoretical grounding for the online learning environment is detailed in Bernstein 
(2011). Throughout, CareerWISE content includes examples relevant to the target audience, drawn 
from the results of focus groups (Bernstein and Russo 2008) and individual interviews (Anderson-Rowland
et al. 2007) as well as research on women’s experiences in science and engineering fields
(Etzkowitz et al. 2000; Fox 2001; Preston 2004; Rosser 2004, 2012; Sonnert and Holton 1995; Valian 
1998, 2007; Settles et al. 2006; Seymour and Hewitt 1997; Xie and Shauman 2003; Wao et al. 2010). 

CareerWISE offers skill building around four areas found to be of common concern amongst the
target audience (Bernstein and Russo 2008): handling difficult interactions with an advisor, juggling
academic and personal commitments, navigating a climate that can be unfriendly to women,
and managing delays and setbacks that are common in the course of pursuing research. There are
more than 250 content pieces in the online program, including over 160 video clips, 3 to 8 minutes
in length, taken from interviews with women with STEM doctoral degrees about their experiences
in graduate school and how they coped; over 50 informational and instructional modules on topics
such as habits of thinking and self-talk, interpersonal communication styles, staying positive,
expectations for and from an advisor, and recognizing sexism; self-tests and practice exercises; and
links to other relevant professional organizations and resources.

Although the training has many related learning objectives that are theoretically and empirically 
predictive of persistence to degree completion (Bernstein 2011), one of the key learning goals is to 
positively affect the online program visitors’ ability to apply interpersonal problem solving skills. 
STEM graduate students are generally very familiar with problem solving models (e.g., the engineering
design process), but they may not be familiar with how to apply such tools to handle nontechnical
situations that arise, for example, as a normal part of research team participation. As such,
the CareerWISE learning environment focuses on the development of skills (e.g., considering others’ 
perspectives, managing emotional responses, and appropriately identifying stakeholders) needed 
to apply a problem-solving model to manage such interpersonal situations. The specific steps and 
sub-steps taught by the CareerWISE learning environment to manage interpersonal problems are 
shown in Table 1, along with an example of associated content for each of the steps. 

| Step | Sub-steps | Example CareerWISE Content |
|---|---|---|
| Step 1: Assess the problem | Step 1A: List the facts | Skill Module: learn how to recognize what sets off your stress reactions and how these triggers influence your interactions with your environment |
| | Step 1B: Understand yourself | |
| | Step 1C: Understand the context | |
| | Step 1D: Collect missing information | |
| Step 2: Specify the outcome you want | Step 2A: Select your priority problem | Skill Module: Learn how to turn ambiguous goals or problems into attainable objectives |
| | Step 2B: Establish what’s under your control | |
| | Step 2C: Set a concrete outcome | |
| Step 3: Strategize | Step 3A: Identify potential solution strategies | Skill Module: learn how to identify and build on your cognitive and emotional strengths |
| | Step 3B: Assess your skills | |
| | Step 3C: Weigh strategies and select the best one | |
| | Step 3D: Make a plan | |
| Step 4: Execute and evaluate | Step 4A: Act on your plan | Video clips from interviews with successful STEM women appear throughout the online program. Sample topics include the following: 1. Persuading an advisor; 2. Isolation and a proactive solution |
| | Step 4B: Evaluate whether the desired outcome was achieved | |
| | Step 4C: Cycle back to Step 1 if the outcome was not achieved | |
| | Step 4D: Review what you have learned | |

Table 1. CareerWISE’s problem solving model. The two left columns of this table
originally appeared in Bernstein (2011).


Instrumentation 

Research shows that skills requiring higher order thinking are most appropriately assessed using
open-ended, performance-based instruments (Jonsson and Svingby 2007; Sankar, Varma, and Raju
2008). Moreover, assessments providing learners the opportunity to respond to realistic scenarios
are particularly useful for the measurement of learning objectives that are difficult to measure with
more traditional formats (e.g., Jonsson and Svingby 2007; Kranov et al. 2008; McMartin, McKenna,
and Youssefi 2000). A drawback of such assessments is that they typically require the development
of an effective scoring rubric for the evaluation of learner responses and the associated training of
scorers who can apply the rubric reliably. An alternative, less labor-intensive assessment approach
for such skill-based measurements would be to ask participants to self-report their abilities, but
such reports are typically inflated (e.g., Arlin 1976).

Since the goal of the study was to evaluate whether an online learning environment could positively
affect participant ability to describe how they would apply interpersonal problem solving
skills, an assessment approach was called for beyond what self-report data from rating scales can
provide. Consequently, an open-ended, scenario-based assessment was chosen, the benefits of
which heavily outweighed the resource-related drawbacks. We discuss later the process that was
followed for rubric development and the resulting measures of the rubric’s effectiveness.

| APSS Item Number | Prompt |
|---|---|
| Question 1 | List the generic steps you would use to tackle a personal or interpersonal problem. The steps need not be put in the context of any specific scenario. |
| Question 2 | Imagine you are [main character from scenario’s name here]. How would you apply the steps and sub-steps you listed in Question 1? Please provide as much detail as possible with your response, and remember to consider that YOU are [main character from scenario’s name here] in the situation (e.g., we are not asking you to give advice to [main character from scenario’s name here]). |

Table 2. Prompts for each of the APSS Questions.

The Assessment of Problem Solving Skills (APSS) instrument contains two questions. The first 
asks the respondent to list the steps taught by the CareerWISE learning environment to tackle 
interpersonal problems, and the second, designed as a performance assessment, asks participants 
to describe how they would apply interpersonal problem solving skills to a realistic graduate 
student-based scenario. Two versions, A and B, of the assessment were developed, each presenting 
a different scenario. The exact prompts for each of the two items are shown in Table 2. 

The context for the scenarios used in Question 2 of each version of the APSS was derived from
focus groups conducted with members of the target population (Bernstein and Russo 2008). The scenarios
themselves were written by an interdisciplinary team of graduate students and faculty including
members from the fields of industrial engineering, bioengineering, physics, and biology (Bekki et al. 
2008). In the writing process, attention was paid to ensure the relevance of scenarios across multiple 
disciplines. Each scenario was also reviewed for content validity by two STEM faculty members who 
were not on the writing team and then revised accordingly. In the scenario presented in Version A of 
the APSS, the main character struggles to get timely feedback from her advisor on her research. In 
the scenario presented in Version B of the APSS, a student’s graduation plans are in jeopardy because 
she needs the signature of her absent advisor in order to schedule her dissertation defense. 

Participant Recruitment and Demographics 

Participants for the RCT were recruited from STEM doctoral programs nationwide and included 
female students in at least the second year of their doctoral studies. First year doctoral students were 
excluded from the study because they were less likely to have begun engagement with a research 
group. Specifically, women were recruited from the following fields in which women are typically most 
underrepresented: chemical engineering, civil engineering, electrical engineering, materials science, 
mechanical engineering, computer science, physics, applied physics, math, applied math, chemistry, 
astronomy, and geological sciences. 

University Names

Carnegie Mellon
Georgia Institute of Technology
MIT
Northwestern
Purdue
Rutgers
Texas A and M
UC Davis
UCLA
University of Arizona
University of CA - Berkeley
University of Central Florida
University of Cincinnati
University of Colorado (Boulder)
University of Illinois Urbana-Champaign
University of Iowa
University of Massachusetts, Amherst
University of Michigan
University of Pittsburgh
University of Southern California
University of Texas Austin
University of Virginia
Virginia Tech

Table 3. Universities from which research participants were recruited.


Recruitment relied on participant self-selection in response to fliers that were distributed electronically
and posted at the 23 universities shown in Table 3. These universities were selected because
the number of women in the targeted departments was at least 15 at the time of the study.
An informational flier, with a request to distribute to female students the invitation to participate, 
was emailed to relevant department chairs at each of the universities and a follow-up phone call 
was made to confirm receipt. The recruitment efforts yielded 150 qualified participants, not all of 
whom were from the target universities; 128 completed the protocol. 

Background information about the RCT participants was obtained from a survey that asked 
standard demographic questions in addition to questions about academic status, career goals, and 
disciplinary research experiences. The mean age of participants was 27.31 years, and, on average, 
they had completed 3.91 years of doctoral study. More than half of the participants (57%) came from 
a science or math discipline, while the remainder (43%) came from engineering disciplines. Of note 
is that the proportion of participants in the study sample from engineering (vs. non-engineering) 
very closely matches that in the STEM population at the time of the study. Per National Science 
Foundation statistics (2004, 2010), 42% of the population of women STEM doctoral students are 
from engineering and 58% from math and science. Most participants (76.5%) reported English as 
their primary language, and 85.4% of participants reported their country of origin as the United 
States, China, India, or Canada. Finally, participants self-identified their race/ethnicity as 3% African 
American, 17.3% Asian or Pacific Islander, 4.5% Hispanic, 0.8% Native American or Alaska Native, 
69.9% Caucasian/White, 2.3% Arab American, and 0.8% multi-ethnic. 

The choice to include the broader STEM population (vs. engineering only) was appropriate given that
the nature of the doctoral student experiences is similar between the physical sciences and engineering
and that the underrepresentation of women extends beyond engineering to the physical sciences
(per the National Science Foundation, 2011, only 34% of doctoral students in the physical science fields
are women). Moreover, the choice of the wider population was necessary from a practical perspective,
as it increased the size of the population from which the research participants could be drawn.

To further evaluate the appropriateness of drawing conclusions about the engineering students 
using the STEM sample, the scores of participants who identified themselves as having come from 
an engineering discipline were statistically compared to those in the sample of participants who 
identified themselves as being in a science or mathematics degree program. Specifically, on each of 
the two questions on the APSS, the scores for engineering participants and non-engineering participants
were compared using a one-way ANOVA. For participants who were in the WLC group, scores
on the second administration of the APSS (following exposure to the CareerWISE online learning
environment) were used as the responses in the analysis. As demonstrated by the results presented
in Table 4, comparison of the engineering vs. non-engineering participants for both Question 1 (p =
0.30) and Question 2 (p = 0.64) showed no statistically significant difference between groups. As
such, we found it to be appropriate to draw conclusions from the study about engineering students
based on data obtained from the entire science and engineering sample.
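
As an illustration of the comparison described above, the following minimal sketch recomputes a two-group one-way ANOVA from the summary statistics reported in Table 4 (group sizes, means, and standard deviations). It is not the analysis code used in the study. The function names are hypothetical, and the p-value is obtained from the equivalence between a two-group ANOVA and a pooled two-sample t-test (F = t²), with the t distribution integrated numerically so that only the Python standard library is needed. Because the tabled values are rounded, the results should only be expected to land near the reported p-values, not to match them exactly.

```python
import math


def two_sided_t_p(t, df, steps=2000):
    """Two-sided p-value for Student's t, via Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # P(0 < T < |t|)
    return 2 * (0.5 - area)


def anova_two_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Return (F, p) for a one-way ANOVA with two groups, from summary statistics."""
    grand_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    ss_between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_within = n1 + n2 - 2
    f_stat = ss_between / (ss_within / df_within)  # between-groups df is 1
    # With two groups, F equals t squared for the pooled two-sample t-test.
    return f_stat, two_sided_t_p(math.sqrt(f_stat), df_within)


# Summary statistics as reported in Table 4 (rounded values).
f_q1, p_q1 = anova_two_groups(55, 2.57, 0.79, 73, 2.71, 0.74)
f_q2, p_q2 = anova_two_groups(55, 24.40, 4.47, 73, 24.73, 3.34)
print(f"Question 1: F = {f_q1:.3f}, p = {p_q1:.3f}")
print(f"Question 2: F = {f_q2:.3f}, p = {p_q2:.3f}")
```

Both p-values fall well above conventional significance thresholds, consistent with the conclusion that the engineering and non-engineering participants did not differ detectably on either question.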

Protocol 

The study followed a wait-list randomized controlled design, which is known to produce the
strongest quality of evidence for evaluating an intervention’s effectiveness (US Department of Education
2009). In randomized controlled trials (RCTs), participants are randomly divided between
treatment and control groups. The treatment group receives the intervention while the control
group does not. Following the delivery of the intervention to the treatment group, comparisons are
made between the treatment and control groups to measure the effects of the intervention. In the
wait-list controlled design employed here, the control group is then given the intervention, which
allows those participants to also gain any associated positive impacts.

| Question from the APSS | Group | Mean Score | SD | ANOVA p-value |
|---|---|---|---|---|
| Question 1 | Engineering (N = 55) | 2.57 | 0.79 | 0.303 |
| | Non-Engineering (N = 73) | 2.71 | 0.74 | |
| Question 2 | Engineering (N = 55) | 24.40 | 4.47 | 0.643 |
| | Non-Engineering (N = 73) | 24.73 | 3.34 | |

Table 4. Summary of ANOVA analyses comparing engineering and non-engineering
participants.

The data and results reported here add to those previously reported (Bekki, Smith, Bernstein, and Harrison
2013) about the efficacy of the CareerWISE online learning environment. Measurement of participant
learning in the RCT was carried out using two instruments, administered to participants at the same points
in the study. The first, made up of seven ratings-type scales, was designed to capture self-report data
from participants on learning outcomes such as resilience and coping efficacy. The second instrument
was the Assessment of Problem Solving Skills (APSS), which is the source of the data reported here. The
protocol, recruiting, and participant demographics are the same as those used in the broader RCT, but
the instrumentation presented here relates only to the measurement and assessment of participants’
interpersonal problem solving skills, not other learning outcome measures. Further detail on the other
outcomes assessed during the RCT can be found in Bekki, Smith, Bernstein, and Harrison (2013).

In total, 128 participants successfully completed the RCT protocol. Figure 2 shows the specific process 
participants followed during the study. They first completed a demographic survey online and were then 
randomly assigned to either a wait-list control (WLC) or a treatment group. Of protocol completers, 
64 were in the WLC group, and 64 were in the treatment group. Important to the validity of an RCT 
study is the random assignment of participants to the WLC and treatment groups. Figure 1 illustrates 
the balance achieved on relevant demographic characteristics generated by this random assignment. 
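
The random assignment step can be sketched as follows. This is a hypothetical illustration, not the procedure used in the study: the paper does not describe the randomization mechanism, and the function names, the seed, and the placeholder participant records are all assumptions. The only figures taken from the paper are the 128 completers and the 55 engineering participants from Table 4. The sketch shuffles the participants, splits them into two equal groups, and then checks how evenly one demographic characteristic is spread across the groups.

```python
import random


def assign_groups(participants, seed=None):
    """Shuffle participants and split them evenly into treatment and WLC groups."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def share(group, key):
    """Proportion of group members for whom the given flag is true."""
    return sum(1 for person in group if person[key]) / len(group)


# Placeholder records: 128 participants, 55 of them flagged as engineering.
participants = [{"id": i, "engineering": i <= 55} for i in range(1, 129)]
treatment, wlc = assign_groups(participants, seed=2013)

print(f"Treatment: n = {len(treatment)}, engineering share = {share(treatment, 'engineering'):.2f}")
print(f"WLC:       n = {len(wlc)}, engineering share = {share(wlc, 'engineering'):.2f}")
```

With random assignment, the two shares should usually be similar but need not be identical, which is why a balance check of the kind summarized in Figure 1 is reported.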



Figure 1. Breakdown of relevant demographic characteristics amongst wait-list control
(WLC) and treatment group members.

Figure 2. Protocol followed by participants in the portion of the RCT reported here.


The treatment group was then given access to the CareerWISE online learning environment and 
asked to spend at least five hours exploring it in an unconstrained manner during a two-week time 
frame. The unconstrained nature of the exploration meant that participants were directed neither to 
engage with any particular content in the online learning environment nor to interact with content in 
any particular order. This choice was made to mimic how the online learning environment would be 
used outside of the experimental setting and introduced the possibility that research study participants
might not visit a representative sample of content. The choice of five hours was made based
on considerations of the amount of content available on the online program; we wanted participants
to have the opportunity to cover both the depth and breadth of the material, and we estimated that 
review of a reasonable proportion of the material would require at least five hours. Also of note is 
that this study was performed prior to the public release of the online learning environment, so 
we did not have regular visitor usage patterns available to us from which to generate a treatment 
length. Although we suspected that in any one visit, a user would spend a shorter amount of time, 
the online learning environment is designed to have repeated visits over an extended period of time, 
and this experimental design was intended to mimic this behavior. After completing the CareerWISE 
learning environment exploration, participants were given online access to Version A of the APSS. 

WLC group participants were given Version A of the APSS approximately two weeks after they 
submitted the demographic survey. They were then given access to the CareerWISE online learning 
environment and asked to spend at least five hours interacting with it, in the same unconstrained 
manner that the treatment group used. After completing their exploration, each WLC group participant
was given online access to Version B of the APSS, which had the same format as Version A,
but depicted a different scenario to which the participants were asked to respond. 

Notably, participants (in both the WLC and treatment groups) were instructed to complete 
the APSS without referring back to the online learning environment itself. After submission of the 
APSS (second APSS for the WLC group), participation was considered complete. As compensation 
for participation, both the WLC and treatment group members who completed the protocol were 
given continued access to the CareerWISE learning environment (not available to the public until 
five months after data collection was completed) along with a $50 gift card to a prominent online 
retailer. Completing participants spent a mean time of 5.39 hours on the online program. 

During the RCT, frequent email communication was maintained with study participants. Reminders
were sent for each of the key protocol activities, including survey submissions and online program 
exploration. For each activity, the final reminder indicated to the participant that she would be 
dropped from the study if she did not comply within the allotted time period. Participants who did 
not complete relevant tasks by the final deadline were informed via email that they had been dropped 
from the study. When participants were dropped from the study, the protocol was reinitiated with a 
back-up participant. Back-up participants were held to the same protocol as initial participants and 
were also dropped from the study if they did not complete activities in a timely fashion. In total, 43 
participants were dropped from the study after giving their consent to participate. Notably, of the 
43 who were dropped, only 19 actually completed the first step in the protocol (submission of the 
demographic survey). A comparison of demographic data between the 19 participants who dropped
out of the study after beginning it and those who completed the study
did not reveal any notable differences.

SCORING RUBRIC


In order to evaluate participant responses to the APSS, two scoring rubrics were created based 
on the CareerWISE learning objectives associated with the interpersonal problem solving model 
shown in Table 1. The first rubric included a coding scheme for scoring responses to Question 1 
(identical in both versions of the instrument) of the APSS. The coding scheme measured participant
ability to describe the interpersonal problem solving model that formed the basis for CareerWISE
content. The second rubric, used in scoring Question 2, included 22 coding schemes which
measured participants’ ability to describe how they would apply the problem solving model to a 
fictional but realistic scenario. Eleven of the coding schemes for Question 2 were for Version A of 
the APSS (one for each of the sub-steps contained within the first three steps shown in Table 1), and 
eleven were corresponding rubrics for Version B of the APSS. There is only a very small amount of 
specific content on the online program related to Step 4 of the problem solving model. As a result, 
the research team agreed that it would not be appropriate to score responses related to Step 4 of 
the problem solving model. 
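
The structure of the Question 2 rubric can be sketched as follows. The eleven scored sub-steps are taken from Steps 1 through 3 of Table 1, and each is rated on the rubric's four-point scale. The assumption that a participant's Question 2 score is the simple sum of the eleven sub-step scores is ours and is not stated explicitly in the text, although it is consistent with the magnitude of the Question 2 means reported in Table 4. The names below are hypothetical.

```python
# Sub-steps of Steps 1-3 in Table 1; Step 4 was not scored.
SCORED_SUBSTEPS = ["1A", "1B", "1C", "1D", "2A", "2B", "2C", "3A", "3B", "3C", "3D"]
VALID_SCORES = (1, 2, 3, 4)  # four-point scale used by every coding scheme


def question2_total(substep_scores):
    """Sum the eleven sub-step scores (assumed aggregation; possible range 11 to 44)."""
    missing = [s for s in SCORED_SUBSTEPS if s not in substep_scores]
    if missing:
        raise ValueError(f"Missing scores for sub-steps: {missing}")
    for substep in SCORED_SUBSTEPS:
        if substep_scores[substep] not in VALID_SCORES:
            raise ValueError(f"Score for {substep} must be one of {VALID_SCORES}")
    return sum(substep_scores[s] for s in SCORED_SUBSTEPS)


# Example: a hypothetical response rated 2 on every sub-step.
example = {substep: 2 for substep in SCORED_SUBSTEPS}
print(question2_total(example))
```

Under this assumed aggregation, the lowest possible total is 11 and the highest is 44.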

Each of the coding schemes specified scores for participant responses on a four-point scale, and 
descriptive text was provided for each of the four scoring options to assist raters. Figure 3 gives 
an example of one of the coding schemes. The coding scheme in Figure 3 is the first draft of the 
coding scheme used to score responses to Question 2 of the APSS, with regard to problem solving 
sub-step 3C. 


Step 3C: WEIGH strategies and select the best one

| Score | Description |
|---|---|
| 1 | Does not choose a solution strategy. |
| 2 | Selects a solution strategy BUT DOES NOT describe the pros and cons of potential strategies AND DOES NOT provide a description of the final decision making strategy (e.g., "went with my gut") that was used to select the chosen solution |
| 3 | Selects a solution strategy AND describes the pros and cons of potential strategies BUT DOES NOT provide a description of the final decision making strategy (e.g., "went with my gut") that was used to select the chosen solution |
| 4 | Selects a solution strategy AND describes the pros and cons of potential strategies AND provides a description of the final decision making strategy (e.g., "went with my gut") that was used to select the chosen solution |

Figure 3. First draft of the rubric used for scoring Question 2, Step 3C, for participant
responses to the Assessment of Problem Solving Skills used during the RCT.

Following initial development of the rubrics, a scoring team participated in a rater training led
by a counseling psychology faculty member who is an expert in the interpersonal problem solving
content. The purpose of the training was to increase rater understanding and skill in applying each
of the coding schemes in an effort to optimize inter-rater reliability. Prior to the training session,
each team member was provided with and asked to use the rubrics to score a selected set of responses.
During the training session, the raters compared their scores for each response and discussed
their reasons for assigning the score based on the rubric, with the goal of reaching consensus on
the correct score for each participant response.

Based on the discussions during the training session and on evaluation by an expert in interpersonal 
problem solving, refinements and improvements to the rubrics were incorporated. An example of the 
final coding scheme used to score responses to Question 2 of the APSS, with regard to problem solving 
step 3C (the draft coding scheme for this same step was provided earlier), is shown in Figure 4. 


Step 3C: WEIGH strategies and select the best one

Score (Descriptive)

1: Does not choose a solution strategy.

2: One of the following conditions is met: selects a solution strategy OR describes the pros and
cons of potential strategies OR provides a description of the final decision making strategy
(e.g., "went with my gut") that was used to select the chosen solution.

3: Two of the following conditions are met: selects a solution strategy AND/OR describes the pros
and cons of potential strategies OR provides a description of the final decision making strategy
(e.g., "went with my gut") that was used to select the chosen solution.

4: All of the following conditions are met: selects a solution strategy AND describes the pros and
cons of potential strategies AND provides a description of the final decision making strategy
(e.g., "went with my gut") that was used to select the chosen solution.

Check box if answer meets each particular criterion:

□ Selects a solution strategy
□ Describes the pros and cons of potential strategies.
□ Provides a description of the final decision making strategy

SCORE, 3C:

Figure 4. Example of the final rubric used for scoring Question 2, Step 3C, for participant
responses to the problem solving assessment used during the RCT.
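
The checkbox logic of the final Step 3C rubric can be sketched in a few lines of Python. This is only an illustration of how the three criteria might map onto the four score levels; the class and function names are hypothetical, and the sketch assumes that a response meeting none of the criteria receives a score of 1 and that each additional criterion met adds one point.

```python
from dataclasses import dataclass


@dataclass
class Step3CChecklist:
    """Hypothetical container for the three checkboxes of the final rubric."""
    selects_strategy: bool
    describes_pros_cons: bool
    describes_decision_strategy: bool


def score_step_3c(checklist: Step3CChecklist) -> int:
    """Return a 1-4 score: 1 plus the number of criteria met (an assumption)."""
    criteria_met = sum([
        checklist.selects_strategy,
        checklist.describes_pros_cons,
        checklist.describes_decision_strategy,
    ])
    return 1 + criteria_met


# Example: a response that selects a strategy and weighs pros and cons,
# but does not explain how the final choice was made.
example = Step3CChecklist(True, True, False)
print(score_step_3c(example))
```

Counting criteria rather than chaining BUT/AND conditions mirrors the change from the draft rubric to the final one, where a rater only ticks boxes and the score follows from the number of boxes ticked.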

Assessment of Rubric Adequacy

Following scorer training, each member of the scoring team used the updated version of the 
rubrics to score 70 randomly selected responses to Question 1 and 70 randomly selected responses 
to Question 2. Responses were selected from all available responses to the APSS and so included 
responses to both Versions A and B of the assessment instrument. Raters were not told whether a 
response was from a participant in the treatment group or wait-list control group, and also could
not determine which responses to Question 1 and Question 2 came from the same participant. 

The most common way to assess the adequacy of scoring rubrics for open-ended responses is
to determine consensus-based inter-rater reliability (Stemler 2004), which measures the degree to
which scorers agree on how to assign the various outcomes available according to the rubric. If the
consensus inter-rater reliability is high enough for a large enough sample of the data, then scores
by multiple scorers are considered equivalent, and only one scorer can rate additional data (vs. having
multiple scorers look at 100% of the data). Krippendorff’s α (Krippendorff 2004) was used for
this calculation. The selection of 70 responses to each question was made based on Krippendorff’s
(2004, Table 11.2) recommendations for a sample size of at least 70 responses to accurately determine
whether α > 0.80 with a significance level of 0.05 when the rubric has four scoring outcomes.
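
The agreement coefficient described above can be sketched as follows. This is a minimal illustration and not the KALPHA macro used in the study: it assumes every response was scored by at least two raters with no missing scores, and the function names are illustrative. The text does not state which level of measurement was chosen for the 1 to 4 rubric scores, so both a nominal and an interval difference function are shown as assumptions.

```python
from itertools import permutations


def nominal_delta_sq(a, b):
    """Squared difference for nominal data: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


def interval_delta_sq(a, b):
    """Squared difference for interval data."""
    return float((a - b) ** 2)


def krippendorff_alpha(units, delta_sq=interval_delta_sq):
    """Krippendorff's alpha = 1 - D_o / D_e for complete data.

    units: list of lists; each inner list holds the scores that the
           raters gave to one response (at least two scores per response).
    """
    units = [u for u in units if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)

    # Observed disagreement: pairs of scores within the same response.
    d_o = sum(
        sum(delta_sq(a, b) for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n

    # Expected disagreement: pairs of scores drawn from all responses pooled.
    d_e = sum(delta_sq(a, b) for a, b in permutations(values, 2)) / (n * (n - 1))
    if d_e == 0:
        raise ValueError("alpha is undefined when all scores are identical")
    return 1.0 - d_o / d_e


# Hypothetical scores from two raters on four responses (not study data).
toy_scores = [[1, 1], [2, 2], [1, 2], [2, 1]]
print(krippendorff_alpha(toy_scores, nominal_delta_sq))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. The confidence intervals reported in this paper come from the bootstrap procedure in the KALPHA macro, which this sketch does not reproduce.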

The inter-rater reliability, α, was calculated separately for Question 1 and Question 2 of the APSS.
The score used in the calculation for responses to Question 2 was the sum of the scores across all 
the sub-steps. We acknowledge that the drawback of summing across all sub-steps is that it takes 
away the possibility for evaluating the impact of the online learning environment on specific steps/ 
components of the problem solving model. However, participants were allowed to explore the online 
learning environment in an unconstrained manner during the experimental treatment period, and 
tracking data collected during the study shows that the number of participants who actually viewed 
each particular content element (of the hundreds available) within the learning environment is not 
large enough to perform an analysis at the level of problem solving sub-step. 

Using the KALPHA macro in SPSS (Hayes and Krippendorff 2007), a 95% confidence interval surrounding
α for each of the questions on the APSS was calculated. The resulting confidence intervals
around α were 95% CI [0.48, 0.58] for Question 1 and 95% CI [0.71, 0.81] for Question 2. Krippendorff
(2004, p. 241) recommends an α of at least 0.80 for drawing more than tentative conclusions. This
value is included in the confidence interval surrounding the α value for Question 2, but is not within
the confidence interval generated for Question 1. Also worth noting, however, is that Jonnson and
Svingby (2004), who report on the results of an analysis of 75 peer-reviewed journal articles in which
rubrics were used to evaluate responses to open-ended assessments, state that the majority of such
assessments report a rater consensus of between 55% and 75%. While the reliability of Question 1 does
not meet the standards suggested by Krippendorff (2004), the confidence interval surrounding α for
Question 1 does fall within the range reported by Jonnson and Svingby (2004), making it consistent
with other published rubrics of this type.

Additional evidence for the adequacy of an assessment tool is demonstrated by its validity, which 
considers whether the assessment actually measures what it was designed to measure (Moskal, 
Leydens, and Pavelich 2002). Expert opinion is often used as a measure of content validity (Stemler 
2004). As mentioned previously, the APSS was evaluated (and resulting modifications incorporated) 
by a domain expert before being disseminated to scorers for the scoring of the data used in the 
reliability calculations. Based on the outcome of the inter-rater reliability and validity evaluations, 
the scoring rubrics were considered to be appropriate for assessing the knowledge of and scenario- 
based application of interpersonal problem solving skills. 

The final scoring rubrics (available upon request to the authors) were applied in preparing the full 
data set for analysis. Responses to each question on the APSS not scored previously were divided 
amongst the trained scorers. Given the inter-rater reliability of the instrument, scoring of a single 
response by multiple scorers was deemed unnecessary. For data that had previously been scored 
by multiple scorers, the average score across all original scorers was calculated. Responses to
Question 1 and Question 2 were treated as separate responses for analysis. Recall that for Question 2, a
separate coding scheme was used to score responses with regard to each sub-step of the problem; 
in all analyses, the sum of the scores across all sub-steps of the problem solving model was used. 
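
The aggregation described above can be illustrated with a short sketch. It assumes that scores for a multiply-scored response are averaged within each sub-step before the sub-steps are summed, which is one plausible reading of the procedure; the function name and the sub-step labels other than 3C are hypothetical.

```python
from statistics import mean


def question2_total(substep_scores):
    """Sum sub-step scores for one response to Question 2.

    substep_scores: dict mapping a sub-step label to the list of rubric
    scores (1-4) assigned by every rater who scored that sub-step.
    Responses scored by a single rater simply have one-element lists.
    """
    return sum(mean(scores) for scores in substep_scores.values())


# Hypothetical response scored by two raters on three sub-steps only;
# a real response would carry one entry per sub-step of the model.
example = {"3A": [3, 4], "3B": [2, 2], "3C": [4, 4]}
print(question2_total(example))
```

When every rater scores every sub-step, averaging within sub-steps and then summing gives the same total as summing each rater's scores and then averaging across raters.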


RESULTS 

Presented next are the results of analyses conducted based on participant responses to the 
APSS. The results are based on one-way ANOVAs comparing WLC and treatment group members 
and paired t-tests showing the change in scores across the two administrations of the APSS to 
WLC members. They illustrate the effectiveness of the CareerWISE online learning environment in 
positively affecting participant ability to describe an effective interpersonal problem solving model 
and to describe how they would apply that model to a scenario depicting a common interpersonal 
challenge. Additional analyses are presented on the correlation between scores on the APSS and 
length of participant responses and on comparisons between the treatment group and WLC group 
on the length of responses. 

For each of the two responses (corresponding to scores from Question 1 and Question 2 of the
APSS), a one-way ANOVA comparing the treatment and wait-list control groups was performed.
Additionally, a paired t-test comparing the two administrations of the problem solving assessment
to the wait-list control group was performed for each response. Results for significance and associated
effect sizes are given in Table 5 and Table 6, along with summary statistics for each of these analyses.
Effect sizes were calculated using Cohen’s d (Cohen 1988) and place the differences between
groups in context. The statistical significance of both ANOVA analyses and associated effect sizes
provide evidence supporting a conclusion that interpersonal problem solving skills can be taught
and that exposure to even a brief intervention designed to affect those skills was able to generate
both increased knowledge of the interpersonal problem solving steps and an improved ability by
participants to describe how they would apply those skills to a fictional scenario.

Question  Points Possible  Group               Mean   SD    Range (Min, Max)  ANOVA p-value  Effect Size
1         4                Treatment (N = 64)  2.70   0.68  (1, 4)            0.005          0.51
                           WLC (N = 64)        2.35   0.71  (1, 4)
2         44               Treatment (N = 64)  25.16  3.87  (15, 33)          0.001          0.58
                           WLC (N = 64)        22.91  3.82  (12, 30)

Table 5. Summary of ANOVA analyses comparing APSS Scores between treatment and
wait-list control groups. Effect sizes were calculated using Cohen's d.
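
As a rough cross-check, the effect sizes and the two-group ANOVA F statistics can be recomputed from the rounded summary statistics in Table 5. The sketch below uses the pooled-standard-deviation form of Cohen's d, which is an assumption about the exact variant the analysis used; the function names are illustrative, and small differences from the reported values are expected because the table entries are rounded.

```python
import math


def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def anova_f_two_groups(m1, s1, n1, m2, s2, n2):
    """One-way ANOVA F statistic for two groups, from summary statistics."""
    grand_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    # Between-groups sum of squares; with two groups it has 1 degree of freedom.
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    # Within-groups mean square with n1 + n2 - 2 degrees of freedom.
    ms_within = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return ss_between / ms_within


# Summary statistics taken from Table 5 (treatment first, then WLC).
q1 = (2.70, 0.68, 64, 2.35, 0.71, 64)
q2 = (25.16, 3.87, 64, 22.91, 3.82, 64)

for label, stats in (("Question 1", q1), ("Question 2", q2)):
    print(label, round(cohens_d(*stats), 2), round(anova_f_two_groups(*stats), 2))
```

With these inputs the recomputed values of d are approximately 0.50 and 0.59, close to the reported 0.51 and 0.58. With two equal groups of 64, F equals d squared times 32, so the F statistics carry the same information as the effect sizes.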

The paired t-test results shown in Table 6 compare APSS scores of WLC group participants in the
study before and after their exposure to the online learning environment. Results include the mean
and standard deviation of WLC participants’ scores on the APSS before and after interaction with the
online learning environment. The p-values and effect sizes are based on the post-treatment
administration of the APSS minus the pre-treatment administration. For both Question
1 and Question 2, participants scored significantly higher (p < 0.05) on the second
administration of the assessment.
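
The paired comparison can be sketched as below. The scores in the example are invented purely to show the calculation and are not study data; the function name is illustrative, and the standardized mean difference shown (mean difference divided by the standard deviation of the differences) is only one of several conventions for a paired effect size, since the text does not say which one was used.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, standardized mean difference) for paired scores.

    Differences are computed as post minus pre, so a positive t means
    scores were higher on the second administration.
    """
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    sd_diff = stdev(diffs)                      # sample SD of the differences
    t_stat = mean(diffs) / (sd_diff / math.sqrt(n))
    d_z = mean(diffs) / sd_diff
    return t_stat, d_z


# Hypothetical pre and post scores for four participants (not study data).
pre_scores = [2, 3, 2, 4]
post_scores = [3, 3, 3, 5]
print(paired_t(pre_scores, post_scores))
```

The t statistic would be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value.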

APSS Question        Mean (Pre)  SD (Pre)  Mean (Post)  SD (Post)  p-value  Effect Size
Question 1 (N = 64)  2.35        0.71      2.60         0.84       0.023    0.30
Question 2 (N = 64)  22.91       3.82      24.02        4.10       0.049    0.25

Table 6. Summary of paired-t analyses comparing two administrations of the APSS
within the wait-list control group. Results are presented as the difference between the
second (post) and first (pre) administrations of the APSS.

One of the intentions of the intervention is to move participants away from a problem solving
approach that considers only a quick response to a technical problem toward a more thoughtful
process that incorporates personal and self-management perspectives. To evaluate the effectiveness
with which this goal was achieved, an analysis of participant response length was conducted. First,
the correlation between response length and score was determined to be r = 0.22 for Question 1 of
the APSS and r = 0.70 for Question 2, indicating that, particularly for Question 2, longer responses
received higher scores. The higher correlation between response length and score for Question 2
was not unexpected, as Question 1 asks participants to simply list steps they would employ in tackling
a personal or interpersonal problem. In comparison to technical problem-solving, the sub-steps
associated with assessing the problem (see Table 1) include significant detail related to the specific
interpersonal aspects of the problem. The second question of the APSS asks participants to consider
and report how they would actually apply all the elements of their interpersonal problem solving
strategy to a given scenario. Responses that scored higher on this question included more detailed
considerations for the specific interpersonal interactions within the scenario.
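
The relationship between response length and score can be illustrated with a short sketch. It assumes that response length is measured as a whitespace-delimited word count and that the correlation reported is Pearson's r; the function names and the example responses are hypothetical and are not study data.

```python
import math


def word_count(response_text):
    """Length of a response in whitespace-delimited words (an assumption)."""
    return len(response_text.split())


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical responses and rubric totals, for illustration only.
responses = [
    "I would talk to my advisor.",
    "First I would define the problem, then list options and weigh each one.",
    "I would assess the situation, set a goal, consider several strategies, "
    "weigh their pros and cons, choose one, and reflect on how it went.",
]
scores = [14, 21, 29]
lengths = [word_count(text) for text in responses]
print(lengths, round(pearson_r(lengths, scores), 2))
```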

Finally, given that longer responses were correlated with better scores, an analysis was performed
to understand whether the treatment group participants had significantly longer responses than those
in the WLC group. A one-way ANOVA was performed comparing the number of words/response for
each question. The results of this analysis are given in Table 7. They demonstrate that for Question
2 of the APSS, the participants in the treatment group did have significantly longer responses than
participants in the WLC group (p = 0.001), providing additional support for the effectiveness of the
CareerWISE online learning environment. For Question 1, the difference in response length between
the groups was not statistically significant (p = 0.127).

APSS Question  Group               Mean (Words/Response)  SD (Words/Response)  ANOVA p-value
1              Treatment (N = 64)  128.81                 88.34                0.127
               WLC (N = 64)        105.86                 80.47
2              Treatment (N = 64)  331.2                  178.4                0.001
               WLC (N = 64)        228.7                  161.9

Table 7. Summary of ANOVA analyses comparing response lengths (in words/response)
between treatment and wait-list control groups.


DISCUSSION


The primary contribution of this paper is to provide empirical evidence in support of the notion
that an online learning environment can positively impact the ability to apply the interpersonal
problem solving skills fundamental to academic and career life among women engineering doctoral
students. Such skills are theoretically linked to persistence amongst women graduate students
(Bernstein 2011), and while researchers have looked at methods to best teach and assess professional
skills (see, e.g., Kranov et al. 2008; Goh 2012; Mohan 2010; Shuman, Besterfield-Sacre, and
McGourty 2005), engineering faculty remain unclear about how to address these competencies in
the classroom (Matusovich et al. 2012). Such a resource, then, would also be useful for individual
graduate students to draw on when they need or could benefit from it.

Results were presented from a scenario-based assessment of interpersonal problem-solving skills
in a randomized controlled trial with a nationwide sample of participants. The rubric used to assess
participant responses in the RCT was shown to have sufficient inter-rater reliability and validity.
Specifically, a 95% confidence interval around the inter-rater reliability for Question 1 of the scoring
rubric was [0.48, 0.58], and for Question 2 was [0.71, 0.81]. We acknowledge we were surprised that
Question 1, which is closer to a knowledge-recall type of question than Question 2 (and so might
be expected to be easier to score), had a lower inter-rater reliability. We suggest the lower inter-rater
reliability is a consequence of the rubric used to score Question 1, which depended largely on
how closely participants were able to recall and describe specific sub-steps of the interpersonal
problem solving model offered in the online learning environment. Many participants offered steps
not contained within the model presented in the online learning environment or listed a sub-step
without enough detail to determine whether it was correct or not. We suggest that the inter-rater
reliability for this question might have been improved if we had structured the prompt to be more
focused, for example, by asking participants to list a particular number of sub-steps.

Based on the comparison between treatment and WLC participants, participants in the treatment
group were shown to have a significantly better ability to describe the steps they would use to
apply interpersonal problem solving skills (p < 0.01). Moreover, and perhaps more importantly given
the notion that what one can actually do is more important than what one can recall (Wiggins 1993),
participants in the treatment group showed a significantly better ability to show how they would
apply their interpersonal problem solving knowledge to a relevant, open-ended scenario (p < 0.01).

Positive effect sizes (Cohen’s d, 0.508 and 0.583 for Questions 1 and 2 respectively) associated
with the analysis comparing treatment and wait-list control groups provided additional support for
the results. These medium effect sizes (Cohen 1988) compare favorably to other studies in which the
application of skills is assessed. For example, in a study designed to measure information literacy
using a scenario-based assessment, Katz et al. (2008) reported an effect size of 0.42 (using Cohen’s
d) when comparing the scores between first year and second year undergraduate students. In a
meta-analysis of 40 papers, Gijbels et al. (2005) reported an average effect size of 0.34, computed using
Glass’s delta, when comparing the impact of a problem based learning pedagogical technique with
another technique in the ability to apply relevant knowledge and concepts. Similar results were given
by Dochy et al. (2003) who reported in a meta-analysis an average effect size of 0.46, calculated
as the standardized mean difference (d-index) effect size, for the assessment of procedural knowledge
when comparing problem based and lecture-based pedagogical styles. Finally, Tsaushu et al.
(2012) reported an effect size of 0.24, calculated using Pearson’s r, when comparing scores on an
open-ended question-based assessment for students being taught with two different pedagogical
techniques in a biology class. The effect sizes for the present study provide a valuable reference
point and support for our results, even though we did not compare two alternative treatments as in
the cited examples. Worth noting is that a comparison with an alternate treatment was not actually
possible in our study, as we are not aware of an available alternative to the treatment used.

A paired t-test analysis comparing the two administrations of the APSS to the WLC group members
was also presented. This analysis demonstrated significant improvements on both dimensions
(ability to describe appropriate steps and ability to describe how they would apply those skills and
steps to solving a relevant interpersonal issue) following exposure to the learning environment.
Associated effect sizes from this analysis across both questions, although smaller in magnitude than
those for the ANOVA analysis, provide further support for the effectiveness of the CareerWISE online
learning environment in improving problem solving skills as assessed here.

Finally, it is also worth repeating that participants in the study were allowed to explore the
CareerWISE learning environment in an unconstrained manner. There was no requirement that participants
actually visit all the parts of the resource dedicated to specific instruction on the interpersonal problem
solving model. This was a deliberate choice so that the experimental treatment would mirror the
way that visitors to the online program would actually interact, searching based on their own interest
rather than according to a step-by-step instruction. While all the content on CareerWISE relates at least
indirectly to one or more of the steps suggested for interpersonal problems, only six (of hundreds of)
pages on the online program explicitly introduce or describe one of the steps or provide a quick overview
of the steps themselves. These pages are highlighted on the front page of the CareerWISE online learning
environment in an effort to direct online program visitors there. However, tracking data collected
during the RCT indicates that only 40% of the participants in the RCT actually visited all of those six
pages during their online program exploration. Given that the majority of participants did not visit all
the pages on the online program dedicated specifically to teaching the problem solving model, the results
indicating increased ability to describe how to apply interpersonal problem solving skills are even more
compelling.

While there has been progress in making academic environments in technical disciplines friendlier
for female doctoral students, it remains important for the students to manage discouraging situations
in the moment and to anticipate future ones. This study demonstrates how an online learning
environment can serve as a resource to expand and reinforce the personal assets of the women. In
turn, they should more effectively be able to navigate interactions and solve the aspects of interpersonal
problems that are under their control.

Implications

This research has several important implications for engineering educators. First, this study 
provided support for the notion that interpersonal problem solving skills can be positively affected 
through interaction with an online learning environment. The online format of the educational resource 
under study provides a useful and scalable opportunity to provide relevant learning opportunities 
to graduate students worldwide. Additionally, the CareerWISE online learning environment provides 
a vehicle for faculty to take advantage of the promising “inverted classroom” technique (e.g., Lage,
Platt, and Treglia 2000; Talbet 2012), allowing instruction on effective interpersonal problem solving 
to take place in a web-based environment outside the class setting, freeing the in-class time for 
more active learning activities. 

The availability of a resource to teach and learn interpersonal problem solving skills also has important
implications for teamwork amongst graduate students. Teamwork itself as well as the impact
of gender on teamwork are widely acknowledged as important and studied at the undergraduate
engineering level (e.g., Ingram and Parker 2002; Loughry, Ohland and Moore 2007; Wolfe and Powell
2009). However, participation on teams that work toward a shared research goal is also a central
component of graduate engineering education (Amelink and Creamer 2010; Crede and Borrego 2012)
and has been shown to be an important component for succeeding and persisting in graduate school
(Rogers and Goktas 2010). Such teams are often made up of members from different disciplines,
cultures, and academic levels (Chinowsky and Rojas 2003; Crede and Borrego 2012; McNair et al.
2011). This diversity increases the complexity of team dynamics, underscoring the importance of
effective interpersonal problem solving skills (Chinowsky and Rojas 2003; McNair et al. 2011).

The use of open-ended assessments as a method for corroborating evaluation results obtained
from self-report scales also provides important evidence for other online learning environment
developers who are looking to demonstrate the effectiveness of their instructional materials. As
described earlier, the research approach presented here provides different data and a new angle to a
full RCT. A recent paper (Bekki, Smith, Bernstein and Harrison 2013) that provided RCT results based
only on self-report, rating-scale-generated data demonstrated the effectiveness of the resource in
positively affecting a number of target learning outcomes including coping efficacy and resilience.
The self-report data in the 2013 paper showed convincing evidence of the online program’s effectiveness,
and many evaluation studies would have stopped at this point. However, the use of the
skill-based, open-ended assessment detailed here provides a different type of support for the online
program based on a direct measure of learning. While the resources required to administer such
an assessment are clearly higher than would be necessary for a standard rating-scale-based (e.g.,
Likert) assessment, confidence in the results from an assessment like the APSS should be higher,
particularly when the measured learning objectives include skill-based concepts.

Limitations and Future Work

As with any study, this one is not without limitations. First, we note the assessments used in this
study, while providing a meaningful check on whether participants know how to apply interpersonal
problem solving, do not evaluate whether the skills are transferable. The findings are limited by the
fact that results are based on participant responses to only a single scenario on one occasion of
assessment, which took place immediately following interaction with the intervention. To secure a
more extensive evaluation of whether participants maintained the skills, results from a far
post-assessment would be required, at minimum. The APSS also does not evaluate whether participants
are actually able to enact interpersonal problem solving skills in their own real life settings. Such an
assessment would require observations of in situ interpersonal interactions, which would be unlikely
to tap the kinds of difficult situations that happen to graduate students at unexpected moments.
Moreover, although the APSS and associated rubrics could potentially be useful to other researchers
and educators for measuring interpersonal problem solving skills, a significant time commitment
would be required to train and engage scorers in the use of the rubrics.

We acknowledge as well that some bias may be present in the findings as a result of the 43
participants who began, but did not complete, the study protocol. While no significant demographic
differences were identifiable between the 19 participants who dropped out of the study
after beginning it and those who completed, the conclusions may be limited to populations who
are more motivated to engage with the content. Finally, the examples included in the online program
itself are intentionally designed to appeal to female and research-active doctoral students
in STEM fields. As such, additional research would be required to determine whether members of
other communities (e.g., undergraduate or male doctoral students in engineering) find the content
useful and relevant to their own experiences.

Future work will include alternative assessments of the intervention’s effectiveness such as follow-up
studies with participants to better understand how they use the interpersonal problem solving
knowledge gained from the online program. While such data would still be one step removed from an
actual observation-based study, it would provide a self-reported indication of the transferability of skills
learned from interaction with the online program to the actual environment of the target audience.
Additionally, data mining approaches will be used to better understand how RCT participants’ trajectories
through the online learning environment (e.g., in what order participants visited various pages within
the online program, and for how long they were on each page) are related to their resultant learning.
We anticipate this analysis will yield a better understanding of how individual content modules within
the online program are related to the acquisition of interpersonal problem solving skills, and we will
seek to identify differences in online program exploration patterns between participants who had
high (vs. lower) scores on the APSS. Such analyses will provide a more nuanced understanding of how
participants obtained interpersonal problem solving skills (vs. the higher level question addressed in
this study of whether they learned them). Finally, further research is planned to understand whether
and how elements of perceived tangible support or external barriers in the participant’s proximal
environment (e.g., characteristics of degree programs, relationship with advisor, consideration of home
life, etc.) are related to learning of interpersonal problem solving skills.